OCR exploration server

Font adaptation of an HMM-based OCR system

Internal identifier: 000781 (Main/Exploration); previous: 000780; next: 000782

Authors: Kamel Ait-Mohand [France]; Laurent Heutte [France]; Thierry Paquet [France]; Nicolas Ragot [France]

Source: Proceedings of SPIE, the International Society for Optical Engineering (ISSN 0277-786X), 2010

RBID : Pascal:10-0429723

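English descriptors: Accuracy; Algorithms; Document retrieval; Hidden Markov models; Optical character recognition; Pattern recognition; Performance evaluation; Probabilistic approach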

Abstract

We build a polyfont OCR recognizer using hidden Markov models (HMMs) of characters trained on a dataset of various fonts. We compare this system with monofont recognizers and show that its performance decreases when it is used to recognize unseen fonts. To close this performance gap, we adapt the model parameters of the polyfont recognizer to a new dataset of unseen fonts using four different adaptation algorithms. Our experiments show that the adapted system is far more accurate than the initial system, although it does not reach the accuracy of a monofont recognizer.
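
This record does not name the four adaptation algorithms. Purely as an illustration of what adapting the parameters of an HMM to new data can look like, the sketch below applies MAP (maximum a posteriori) re-estimation to the mean of one Gaussian emission density, a standard technique that is assumed here and not confirmed by the abstract. The function name `map_adapt_mean`, the prior weight `tau` and the sample values are hypothetical.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP update of a single Gaussian mean (illustrative sketch).

    prior_mean -- mean vector taken from the initial (polyfont) model
    frames     -- feature vectors observed on the new font
    posteriors -- occupation probability of this Gaussian for each frame
    tau        -- weight given to the prior mean (hypothetical value, must be > 0)
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)

    # Posterior-weighted sum of the adaptation frames.
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the adaptation data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Made-up numbers, for illustration only.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0], [2.0, 4.0]], [1.0, 1.0], tau=2.0)
print(adapted)
```

With little adaptation data the result stays close to the prior mean; as the occupancy grows, it moves toward the weighted average of the new frames.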


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Font adaptation of an HMM-based OCR system</title>
<author>
<name sortKey="Ait Mohand, Kamel" sort="Ait Mohand, Kamel" uniqKey="Ait Mohand K" first="Kamel" last="Ait-Mohand">Kamel Ait-Mohand</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author>
<name sortKey="Heutte, Laurent" sort="Heutte, Laurent" uniqKey="Heutte L" first="Laurent" last="Heutte">Laurent Heutte</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author>
<name sortKey="Paquet, Thierry" sort="Paquet, Thierry" uniqKey="Paquet T" first="Thierry" last="Paquet">Thierry Paquet</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author>
<name sortKey="Ragot, Nicolas" sort="Ragot, Nicolas" uniqKey="Ragot N" first="Nicolas" last="Ragot">Nicolas Ragot</name>
<affiliation wicri:level="3">
<inist:fA14 i1="02">
<s1>Université François Rabelais Tours, LI EA 2101, 64 avenue Jean Portalis</s1>
<s2>37200 Tours</s2>
<s3>FRA</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Centre-Val de Loire</region>
<region type="old region" nuts="2">Région Centre</region>
<settlement type="city">Tours</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">10-0429723</idno>
<date when="2010">2010</date>
<idno type="stanalyst">PASCAL 10-0429723 INIST</idno>
<idno type="RBID">Pascal:10-0429723</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000160</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000617</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000155</idno>
<idno type="wicri:doubleKey">0277-786X:2010:Ait Mohand K:font:adaptation:of</idno>
<idno type="wicri:Area/Main/Merge">000786</idno>
<idno type="wicri:Area/Main/Curation">000781</idno>
<idno type="wicri:Area/Main/Exploration">000781</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Font adaptation of an HMM-based OCR system</title>
<author>
<name sortKey="Ait Mohand, Kamel" sort="Ait Mohand, Kamel" uniqKey="Ait Mohand K" first="Kamel" last="Ait-Mohand">Kamel Ait-Mohand</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author>
<name sortKey="Heutte, Laurent" sort="Heutte, Laurent" uniqKey="Heutte L" first="Laurent" last="Heutte">Laurent Heutte</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author>
<name sortKey="Paquet, Thierry" sort="Paquet, Thierry" uniqKey="Paquet T" first="Thierry" last="Paquet">Thierry Paquet</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author>
<name sortKey="Ragot, Nicolas" sort="Ragot, Nicolas" uniqKey="Ragot N" first="Nicolas" last="Ragot">Nicolas Ragot</name>
<affiliation wicri:level="3">
<inist:fA14 i1="02">
<s1>Université François Rabelais Tours, LI EA 2101, 64 avenue Jean Portalis</s1>
<s2>37200 Tours</s2>
<s3>FRA</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Centre-Val de Loire</region>
<region type="old region" nuts="2">Région Centre</region>
<settlement type="city">Tours</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
<imprint>
<date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Accuracy</term>
<term>Algorithms</term>
<term>Document retrieval</term>
<term>Hidden Markov models</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Performance evaluation</term>
<term>Probabilistic approach</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Algorithme</term>
<term>Reconnaissance forme</term>
<term>Recherche documentaire</term>
<term>Modèle Markov variable cachée</term>
<term>Reconnaissance optique caractère</term>
<term>Evaluation performance</term>
<term>Précision</term>
<term>Approche probabiliste</term>
<term>0130C</term>
<term>4230S</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Recherche documentaire</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">We build a polyfont OCR recognizer using hidden Markov models (HMMs) of characters trained on a dataset of various fonts. We compare this system with monofont recognizers and show that its performance decreases when it is used to recognize unseen fonts. To close this performance gap, we adapt the model parameters of the polyfont recognizer to a new dataset of unseen fonts using four different adaptation algorithms. Our experiments show that the adapted system is far more accurate than the initial system, although it does not reach the accuracy of a monofont recognizer.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Centre-Val de Loire</li>
<li>Haute-Normandie</li>
<li>Région Centre</li>
<li>Région Normandie</li>
</region>
<settlement>
<li>Saint-Etienne-du-Rouvray</li>
<li>Tours</li>
</settlement>
<orgName>
<li>Université de Rouen</li>
</orgName>
</list>
<tree>
<country name="France">
<region name="Région Normandie">
<name sortKey="Ait Mohand, Kamel" sort="Ait Mohand, Kamel" uniqKey="Ait Mohand K" first="Kamel" last="Ait-Mohand">Kamel Ait-Mohand</name>
</region>
<name sortKey="Heutte, Laurent" sort="Heutte, Laurent" uniqKey="Heutte L" first="Laurent" last="Heutte">Laurent Heutte</name>
<name sortKey="Paquet, Thierry" sort="Paquet, Thierry" uniqKey="Paquet T" first="Thierry" last="Paquet">Thierry Paquet</name>
<name sortKey="Ragot, Nicolas" sort="Ragot, Nicolas" uniqKey="Ragot N" first="Nicolas" last="Ragot">Nicolas Ragot</name>
</country>
</tree>
</affiliations>
</record>
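
A record of this shape can be read with a standard XML parser. The following is a minimal sketch using Python's `xml.etree.ElementTree` on a trimmed copy of the record. The `wicri:` and `inist:` prefixes are not declared anywhere in the page, so placeholder namespace URIs are bound on `<record>` here; those URIs and the function name `summarize` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record. The xmlns:* declarations are placeholders
# (assumption): the page uses these prefixes without declaring them, and
# the parser rejects unbound prefixes.
RECORD = """<record xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist">
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Font adaptation of an HMM-based OCR system</title>
<author>
<name sortKey="Ait Mohand, Kamel" first="Kamel" last="Ait-Mohand">Kamel Ait-Mohand</name>
<affiliation wicri:level="4">
<country>France</country>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author>
<name sortKey="Ragot, Nicolas" first="Nicolas" last="Ragot">Nicolas Ragot</name>
<affiliation wicri:level="3">
<country>France</country>
</affiliation>
</author>
</titleStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def summarize(record_xml):
    """Return the title and a list of (last, first, orgName) tuples."""
    root = ET.fromstring(record_xml)
    title_stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    title = title_stmt.findtext("title")
    authors = []
    for author in title_stmt.findall("author"):
        name = author.find("name")
        # orgName is absent for some affiliations; findtext then returns None.
        org = author.findtext("affiliation/orgName")
        authors.append((name.get("last"), name.get("first"), org))
    return title, authors


title, authors = summarize(RECORD)
print(title)
for last, first, org in authors:
    print(f"{last}, {first} ({org or 'no orgName listed'})")
```

The same function should work on the full record once the two prefixes are bound in the same way.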

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000781 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000781 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:10-0429723
   |texte=   Font adaptation of an HMM-based OCR system
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024